Deep learning-based medical image analysis technology has been developed to the extent that it shows an accuracy surpassing the ability of a human radiologist in some tasks. However, data labeling on medical images requires human experts and a great deal of time and expense. Moreover, medical image data usually have an imbalanced distribution for each disease. In particular, in multilabel classification, learning with a small number of labeled data causes overfitting problems. The model easily overfits the limited number of labeled data, while it still underfits the large amount of unlabeled data. In this study, we propose a method that combines entropy-based Mixup and self-training to improve the performance of data-imbalanced chest X-ray classification. The proposed method is to apply the Mixup algorithm to limited labeled data to alleviate the data imbalance problem and perform self-training that effectively utilizes the unlabeled data while iterating this process by replacing the teacher model with the student model. Experimental results in an environment with a limited number of labeled data and a large number of unlabeled data showed that the classification performance was improved by combining entropy-based Mixup and self-training.
Loading....